Strategies for Parallel Data Mining

Author

  • David Skillicorn
Abstract

These explanations can be opaque (providing a means of classifying variation without any explanation of why it works) or transparent (describing what creates the observed variation). For two reasons, data mining could be the killer application that parallel computing has been seeking. First, analyzing variation appears to be algorithmically complex and hence might require levels of computing power that only parallel computers can provide in a timely way. (See the sidebar, “Why parallelize data mining?”) Second, the data sets involved are large and growing rapidly, and parallel computers are organized to handle such volumes effectively, although some data-mining problems are already taxing their limits. Data-mining algorithms can potentially be parallelized in many different ways. Because designing and implementing parallel programs is expensive, it is impractical to test all of these strategies by building implementations and comparing them. Fortunately, practical complexity measures for parallel programming are rapidly maturing. This article shows how to use such measures to assess different parallelization strategies for data-mining algorithms. We also show how the structure of some algorithms results in a double-speedup phenomenon. A high-level cost analysis greatly simplifies the search for an effective algorithm.

Similar resources

Two Strategies Based on Meta-Heuristic Algorithms for Parallel Row Ordering Problem (PROP)

Proper arrangement of facility layout is a key issue in management that influences the efficiency and profitability of manufacturing systems. The Parallel Row Ordering Problem (PROP) is a special case of the facility layout problem and consists of finding the best locations for n facilities, where similar facilities (facilities that have some characteristics in common) should be arranged in a row ...

A redundancy allocation problem with the choice of redundancy strategies by a memetic algorithm

This paper proposes an efficient algorithm based on a memetic algorithm (MA) for a redundancy allocation problem without component mixing (RAPCM) in a series-parallel system when the redundancy strategy can be chosen for individual subsystems. The majority of the solution methods for the general RAPCM assume that the type of redundancy strategy for each subsystem is pre-determined and known a prior...

A Memtic genetic algorithm for a redundancy allocation problem

In general redundancy allocation problems, the redundancy strategy for each subsystem is predetermined. Tavakkoli-Moghaddam presented a series-parallel redundancy allocation problem with mixing components (RAPMC) in which the redundancy strategy can be chosen for individual subsystems. In this paper, we present a bi-objective redundancy allocation when the redundancy strategies for...

Parleda: a Library for Parallel Processing in Computational Geometry Applications

ParLeda is a software library that provides the basic primitives needed for parallel implementation of computational geometry applications. It can also be used in implementing a parallel application that uses geometric data structures. The parallel model that we use is based on a new heterogeneous parallel model named HBSP, which is based on BSP and is introduced here. ParLeda uses two main lib...

Extracting a Parallel Corpus from Comparable Documents to Improve Translation Quality in Machine Translation Systems

Data used for training statistical machine translation methods are usually prepared from three kinds of resources: parallel, non-parallel, and comparable text corpora. Parallel corpora are an ideal resource for translation, but because such texts are scarce, non-parallel and comparable corpora are also used for parallel text extraction. Most of existing methods for exploiting comparable corpora loo...

A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure

The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that could run on a Fibonacci Hypercube structure. Most of the popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; therefore, a method that can run on all structures, especially the Fibonacci Hypercube structure, is necessary for parallel matr...

Publication date: 1999